Emergent communication research often focuses on optimizing task-specific utility as the driver of communication. However, human languages appear to evolve under pressure to efficiently compress meanings into communication signals, by optimizing the Information Bottleneck tradeoff between informativeness and complexity. In this work, we study how trading off these three factors (utility, informativeness, and complexity) shapes emergent communication, including in comparison to human communication. To this end, we propose Vector-Quantized Variational Information Bottleneck (VQ-VIB), a method for training neural agents to compress inputs into discrete signals embedded in a continuous space. We train agents via VQ-VIB and compare their performance to previously proposed neural architectures in grounded environments and in a Lewis reference game. Across all neural architectures and settings, taking communicative informativeness into account improves communication convergence rates, and penalizing communicative complexity leads to human-like lexicon sizes while maintaining high utility. In addition, we find that VQ-VIB outperforms other discrete communication methods. This work demonstrates how fundamental principles believed to characterize the evolution of human language can inform emergent communication in artificial agents.
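To make the discretization step concrete, here is a minimal sketch of the vector-quantization layer a VQ-VIB-style agent relies on (the class name, hyperparameters, and the commitment-style penalty standing in for the complexity term are our assumptions, not the authors' implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Minimal VQ layer: maps continuous encodings to discrete codebook entries.

    Illustrative sketch in the VQ-VAE style, not the paper's VQ-VIB code;
    the loss below is a stand-in for the information-bottleneck penalty.
    """

    def __init__(self, num_tokens: int = 16, dim: int = 8, beta: float = 0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_tokens, dim)  # learned token embeddings
        self.beta = beta

    def forward(self, z: torch.Tensor):
        # z: (batch, dim) continuous encoder output.
        dists = torch.cdist(z, self.codebook.weight)   # (batch, num_tokens)
        idx = dists.argmin(dim=1)                      # discrete token ids
        quantized = self.codebook(idx)                 # (batch, dim)
        # Straight-through estimator: copy gradients from quantized to z.
        quantized_st = z + (quantized - z).detach()
        # Codebook loss pulls codes to encodings; commitment term (beta)
        # pulls encodings to codes, acting as a simple complexity pressure.
        loss = F.mse_loss(quantized, z.detach()) \
             + self.beta * F.mse_loss(z, quantized.detach())
        return quantized_st, idx, loss

# Usage: quantize a batch of continuous "meanings" into discrete signals.
vq = VectorQuantizer()
signal, token_ids, aux_loss = vq(torch.randn(4, 8))
```

The straight-through estimator lets gradients flow through the non-differentiable nearest-neighbor lookup, which is what allows discrete signals to live in a trainable continuous embedding space.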
Learning from demonstration (LfD) methods have shown promise for solving multi-step tasks; however, these approaches do not guarantee successful reproduction of the task under disturbances. In this work, we identify the root of this challenge as the failure of a learned continuous policy to satisfy the discrete plan implicit in the demonstration. By using modes (rather than subgoals) as the discrete abstraction, together with motion policies that have mode invariance and goal reachability properties, we prove that our learned continuous policy can simulate any discrete plan specified by a Linear Temporal Logic (LTL) formula. Consequently, the imitator is robust to both task- and motion-level disturbances and is guaranteed to achieve task success. Project page: https://sites.google.com/view/ltl-ds
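For readers unfamiliar with LTL, a discrete plan of the kind the guarantee covers might look like the following purely illustrative formula (our example, not one from the paper), read as "eventually reach mode A and afterwards the goal mode, while never entering a failure mode":

```latex
% Illustrative LTL task specification (not taken from the paper):
\varphi \;=\; \Diamond\bigl(m_A \wedge \Diamond\, m_{\mathrm{goal}}\bigr)
\;\wedge\; \Box\,\neg\, m_{\mathrm{fail}}
```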
Neural rationale models are popular for interpretable prediction in NLP tasks. In these models, a selector extracts segments of the input text, called rationales, and passes them to a classifier for prediction. Since the rationale is the only information the classifier can access, it is plausibly characterized as the explanation. But is such a characterization unconditionally correct? In this paper, we argue to the contrary: both philosophical perspectives and empirical evidence suggest that rationale models are perhaps less rational and interpretable than expected. We call for more rigorous and comprehensive evaluations of these models to ensure that the desired properties of interpretability are actually achieved. The code can be found at https://github.com/yimingz89/neural-rationale-analysis.
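As a schematic of the select-then-predict pipeline under scrutiny, here is a toy sketch (the shapes, top-k selector, and mean pooling are our simplifications; real rationale models typically sample binary masks over a proper text encoder):

```python
import torch
import torch.nn as nn

class RationaleModel(nn.Module):
    """Toy select-then-predict pipeline: the classifier only sees the rationale."""

    def __init__(self, vocab: int = 1000, dim: int = 32, classes: int = 2, k: int = 5):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.selector = nn.Linear(dim, 1)        # per-token relevance score
        self.classifier = nn.Linear(dim, classes)
        self.k = k

    def forward(self, tokens: torch.Tensor):
        # tokens: (batch, seq_len) integer ids.
        x = self.embed(tokens)                         # (batch, seq, dim)
        scores = self.selector(x).squeeze(-1)          # (batch, seq)
        topk = scores.topk(self.k, dim=1).indices      # rationale positions
        mask = torch.zeros_like(scores).scatter(1, topk, 1.0)
        # Classifier input is the average of rationale tokens only.
        pooled = (x * mask.unsqueeze(-1)).sum(1) / self.k
        return self.classifier(pooled), mask           # prediction + rationale

model = RationaleModel()
logits, rationale_mask = model(torch.randint(0, 1000, (2, 20)))
```

Because the classifier only receives the masked input, the rationale looks like a faithful explanation by construction; the paper questions whether that characterization actually holds.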
In settings where neural agents are trained with reinforcement learning and can communicate with one another via discrete tokens, agents can achieve as a team what no agent could accomplish alone. However, the current standard of using one-hot vectors as discrete communication tokens prevents agents from acquiring more desirable properties of communication, such as zero-shot understanding. Inspired by word embedding techniques from natural language processing, we propose neural agent architectures that enable agents to communicate via discrete tokens drawn from a learned, continuous space. We show, within a decision-theoretic framework, that our technique optimizes communication over a wide range of scenarios, whereas one-hot tokens are optimal only under restrictive assumptions. In self-play experiments, we validate that our trained agents learn to cluster tokens in semantically meaningful ways, allowing them to communicate in noisy environments where other techniques fail. Finally, we demonstrate both that agents using our method can respond effectively to novel human communication and that humans can understand unlabeled emergent agent communication, outperforming one-hot communication.
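The benefit over one-hot tokens can be illustrated with a small decoding experiment (our construction, not the paper's code): because tokens are points in a shared continuous space, a listener can decode a noise-corrupted signal by nearest-neighbor lookup, which has no meaningful analogue for arbitrary one-hot indices.

```python
import torch

# Hypothetical learned token embeddings shared by speaker and listener.
num_tokens, dim = 10, 4
token_embeddings = torch.randn(num_tokens, dim)

def speak(token_id: int) -> torch.Tensor:
    """Speaker transmits the continuous embedding of a discrete token."""
    return token_embeddings[token_id]

def listen(signal: torch.Tensor) -> int:
    """Listener decodes a (possibly noisy) signal to the nearest token."""
    dists = torch.cdist(signal.unsqueeze(0), token_embeddings)
    return int(dists.argmin())

# Noise on the channel: nearby embeddings still decode to the right token
# when the space clusters semantically similar tokens together.
sent = 3
received = listen(speak(sent) + 0.05 * torch.randn(dim))
print(sent, received)  # small noise almost always decodes back to token 3
```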
Feature attribution methods are popular in interpretable machine learning. These methods compute an attribution for each input feature to represent its importance, but there is no consensus on the definition of "attribution", leading to many competing methods, with systematic evaluation complicated in particular by the lack of ground-truth attributions. To address this, we propose a dataset modification procedure to induce such ground truth. Using this procedure, we evaluate three common methods: saliency maps, rationales, and attention. We identify several deficiencies, adding a new perspective to the growing body of evidence questioning the correctness and reliability of these methods when applied to datasets in the wild. We further discuss possible avenues for remedy, and recommend that new attribution methods be tested against ground truth before deployment. Code is available at https://github.com/yilunzhou/feature-attribution-evaluation
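One simple instance of inducing ground-truth attributions by dataset modification (an illustrative construction, not the paper's exact procedure) is to plant a feature that fully determines the label, so that a faithful attribution method must rank that feature highly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Original data: labels are independent of the features.
X = rng.normal(size=(1000, 20))
y = rng.integers(0, 2, size=1000)

# Dataset modification: plant feature 7 so that it alone determines the label.
PLANTED = 7
X[:, PLANTED] = y * 2.0 - 1.0  # +1 for class 1, -1 for class 0

def evaluate_attribution(attributions: np.ndarray) -> float:
    """Fraction of examples where the planted feature gets the top attribution.

    `attributions` has shape (n_examples, n_features); a faithful method
    should score near 1.0 on this modified dataset.
    """
    return float((np.abs(attributions).argmax(axis=1) == PLANTED).mean())

# Example: a (deliberately bad) random-attribution baseline scores ~1/20.
print(evaluate_attribution(rng.normal(size=X.shape)))
```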
When robots learn reward functions using high-capacity models that take raw state directly as input, they need to learn both a representation for what matters in the task -- the task "features" -- and how to combine these features into a single objective. If they try to do both at once from input designed to teach the full reward function, it is easy to end up with a representation that contains spurious correlations in the data, which fails to generalize to new settings. Instead, our ultimate goal is to enable robots to identify and isolate the causal features that people actually care about and use when they represent states and behavior. Our idea is that we can tune into this representation by asking users what behaviors they consider similar: behaviors will be similar if the features that matter are similar, even if low-level behavior is different; conversely, behaviors will be different if even one of the features that matter differs. This, in turn, is what enables the robot to disambiguate between what needs to go into the representation versus what is spurious, as well as which aspects of behavior can be compressed together and which cannot. The notion of learning representations based on similarity has a nice parallel in contrastive learning, a self-supervised representation learning technique that maps visually similar data points to similar embeddings, where similarity is defined by a designer through data augmentation heuristics. By contrast, in order to learn the representations that people use, and thereby their preferences and objectives, we use their definition of similarity. In simulation as well as in a user study, we show that learning through such similarity queries leads to representations that, while far from perfect, are indeed more generalizable than self-supervised and task-input alternatives.
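A minimal sketch of learning from such similarity queries (the pairwise interface and the contrastive margin loss are our assumptions; the study's actual queries and architecture may differ): each answered query yields a pair of behaviors plus a human similar/dissimilar label, which drives a contrastive-style embedding loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy behavior encoder over fixed-size behavior summaries.
encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))

def similarity_query_loss(traj_a, traj_b, similar, margin: float = 1.0):
    """Contrastive loss driven by human similarity labels.

    traj_a, traj_b: (batch, 16) behavior summaries; similar: (batch,) in {0, 1},
    with 1 meaning the user judged the pair similar. Similar pairs are pulled
    together; dissimilar pairs are pushed at least `margin` apart.
    """
    za, zb = encoder(traj_a), encoder(traj_b)
    d = F.pairwise_distance(za, zb)
    return (similar * d.pow(2) + (1 - similar) * F.relu(margin - d).pow(2)).mean()

# One gradient step on a batch of answered similarity queries.
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
loss = similarity_query_loss(torch.randn(32, 16), torch.randn(32, 16),
                             torch.randint(0, 2, (32,)).float())
loss.backward()
opt.step()
```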
We address the problem of extracting key steps from unlabeled procedural videos, motivated by the potential of Augmented Reality (AR) headsets to revolutionize job training and performance. We decompose the problem into two steps: representation learning and key step extraction. We employ self-supervised representation learning via a training strategy that adapts off-the-shelf video features using a temporal module. Training implements self-supervised learning losses involving multiple cues such as appearance, motion and pose trajectories extracted from videos to learn generalizable representations. Our method extracts key steps via a tunable algorithm that clusters the representations extracted from procedural videos. We quantitatively evaluate our approach with key step localization and also demonstrate the effectiveness of the extracted representations on related downstream tasks like phase classification. Qualitative results demonstrate that the extracted key steps are meaningful and succinctly represent the procedural tasks.
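As a rough sketch of the key-step extraction stage (the clustering details here are assumptions, not the paper's tuned algorithm), per-segment representations can be clustered and each cluster summarized by its most central segment, ordered by time:

```python
import numpy as np
from sklearn.cluster import KMeans

def extract_key_steps(features: np.ndarray, times: np.ndarray, n_steps: int = 5):
    """Cluster per-segment features and return one representative per cluster.

    features: (n_segments, dim) learned representations; times: (n_segments,)
    timestamps. Returns indices of key-step segments in temporal order.
    Illustrative only; the paper uses a tunable clustering algorithm.
    """
    km = KMeans(n_clusters=n_steps, n_init=10, random_state=0).fit(features)
    reps = []
    for c in range(n_steps):
        members = np.where(km.labels_ == c)[0]
        # Representative = member closest to the cluster centroid.
        dists = np.linalg.norm(features[members] - km.cluster_centers_[c], axis=1)
        reps.append(members[dists.argmin()])
    return sorted(reps, key=lambda i: times[i])

key_steps = extract_key_steps(np.random.rand(200, 64), np.arange(200))
```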
An oft-cited open problem of federated learning is the existence of data heterogeneity at the clients. One pathway to understanding the drastic accuracy drop in federated learning is to scrutinize the behavior of the clients' deep models on data with different levels of "difficulty", which has been left unaddressed. In this paper, we investigate a different and rarely studied dimension of FL: ordered learning. Specifically, we aim to investigate how ordered learning principles can contribute to alleviating the heterogeneity effects in FL. We present theoretical analysis and conduct extensive empirical studies on the efficacy of orderings spanning three kinds of learning: curriculum, anti-curriculum, and random curriculum. We find that curriculum learning largely alleviates non-IIDness. Interestingly, the more disparate the data distributions across clients, the more they benefit from ordered learning. We provide analysis explaining this phenomenon, specifically indicating how curriculum training appears to make the objective landscape progressively less convex, suggesting fast converging iterations at the beginning of the training procedure. We derive quantitative convergence results for both convex and nonconvex objectives by modeling the curriculum training on federated devices as local SGD with locally biased stochastic gradients. Also, inspired by ordered learning, we propose a novel client selection technique that benefits from the real-world disparity in the clients. Our proposed approach to client selection has a synergistic effect when applied together with ordered learning in FL.
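To make the ordered-learning idea concrete, here is a minimal sketch of one client's local pass (loss-based difficulty scoring and per-sample SGD are our assumptions, not the paper's exact procedure):

```python
import torch
import torch.nn.functional as F

def curriculum_local_sgd(model, X, y, lr=0.01, anti=False):
    """One local pass with samples ordered by per-sample loss (easy to hard).

    Illustrative sketch: `anti=True` gives anti-curriculum (hard to easy);
    shuffling instead gives the random-curriculum baseline.
    """
    with torch.no_grad():
        losses = F.cross_entropy(model(X), y, reduction="none")
    order = torch.argsort(losses, descending=anti)   # easy -> hard by default
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for i in order.tolist():                         # locally biased SGD
        opt.zero_grad()
        F.cross_entropy(model(X[i : i + 1]), y[i : i + 1]).backward()
        opt.step()
    return model

model = torch.nn.Linear(10, 3)
curriculum_local_sgd(model, torch.randn(64, 10), torch.randint(0, 3, (64,)))
```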
This paper tackles the challenging problem of automating code updates to fix deprecated API usages of open source libraries by analyzing their release notes. Our system employs a three-tier architecture: first, a web crawler service retrieves deprecation documentation from the web; then a specially built parser processes those text documents into tree-structured representations; finally, a client IDE plugin locates and fixes identified deprecated usages of libraries in a given codebase. The focus of this paper is the parsing component in particular. We introduce a novel transition-based parser in two variants: one based on a classical feature-engineered classifier and one based on a neural tree encoder. To confirm the effectiveness of our method, we gathered and labeled a set of 426 API deprecations from 7 well-known Python data science libraries, and demonstrated that our approach decisively outperforms a non-trivial neural machine translation baseline.
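To illustrate what transition-based parsing means in general (a generic shift/reduce skeleton with a stubbed action policy; the paper's actual transition system, features, and tree schema are not reproduced here):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    children: list = field(default_factory=list)

def parse(tokens, choose_action):
    """Generic transition-based parsing loop (shift/reduce skeleton).

    `choose_action(stack, buffer)` returns "SHIFT" or "REDUCE"; in the paper
    this decision comes from a feature-engineered classifier or a neural
    tree encoder. REDUCE attaches the top of the stack to the item below it.
    """
    stack, buffer = [], [Node(t) for t in tokens]
    while buffer or len(stack) > 1:
        action = choose_action(stack, buffer)
        if action == "SHIFT" and buffer:
            stack.append(buffer.pop(0))
        elif len(stack) >= 2:               # REDUCE
            child = stack.pop()
            stack[-1].children.append(child)
        else:                               # fall back to SHIFT
            stack.append(buffer.pop(0))
    return stack[0]

# Stub policy: shift everything, then reduce right-to-left.
tree = parse("deprecated method foo replaced by bar".split(),
             lambda stack, buffer: "SHIFT" if buffer else "REDUCE")
```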
Using a comprehensive sample of 2,585 bankruptcies from 1990 to 2019, we benchmark the performance of various machine learning models in predicting financial distress of publicly traded U.S. firms. We find that gradient boosted trees outperform other models in one-year-ahead forecasts. Variable permutation tests show that excess stock returns, idiosyncratic risk, and relative size are the most important variables for prediction. Textual features derived from corporate filings do not improve performance materially. In a credit competition model that accounts for the asymmetric cost of default misclassification, the survival random forest is able to capture large dollar profits.
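As a minimal sketch of the modeling and permutation-importance steps described above (synthetic data and feature names are our stand-ins; the study uses actual filings and market data from 2,585 labeled bankruptcies):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in features mirroring the paper's most important predictors.
n = 2000
X = rng.normal(size=(n, 3))  # excess return, idiosyncratic risk, relative size
y = (X[:, 0] - 0.5 * X[:, 1] + 0.3 * rng.normal(size=n) < -1).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
gbt = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

# Variable permutation test: importance = score drop when a column is shuffled.
imp = permutation_importance(gbt, X_te, y_te, n_repeats=10, random_state=0)
for name, score in zip(["excess_return", "idio_risk", "rel_size"],
                       imp.importances_mean):
    print(f"{name}: {score:.3f}")
```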